
159 research outputs found

Tornado Detection with Support Vector Machines

Author: B. Schölkopf
C. Cortes
C. Marzban
C.J.C. Burges
D.B. Stephenson
F. Girosi
R. Collobert
T. Evgeniou
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2003

Abstract. The National Weather Service (NWS) Mesocyclone Detection Algorithms (MDA) use empirical rules to process velocity data from the Weather Surveillance Radar 1988 Doppler (WSR-88D). In this study, Support Vector Machines (SVMs) are applied to mesocyclone detection. Comparison with other classification methods, such as neural networks and radial basis function networks, shows that SVMs are more effective in mesocyclone/tornado detection.

CiteSeerX

Crossref

Inferring latent task structure for Multitask Learning by Multiple Kernel Learning

Author: B Schölkopf
C Chang
C Leslie
Christian Widmer
F Bach
G Rätsch
G Schweikert
Gunnar Rätsch
H Daumé
H Daumé III
J Blitzer
J Robinson
L Bottou
L Jacob
L Jacob
M Kloft
Nora C Toussaint
P Gehler
R Caruana
S Sonnenburg
Schuller Ben-David
T Evgeniou
T Evgeniou
T Joachims
V Vapnik
Y Xue
Yasemin Altun
Publication venue: BioMed Central
Publication date: 01/01/2010

Abstract. Background: The lack of sufficient training data is the limiting factor for many Machine Learning applications in Computational Biology. If data is available for several different but related problem domains, Multitask Learning algorithms can be used to learn a model based on all available information. In Bioinformatics, many problems can be cast into the Multitask Learning scenario by incorporating data from several organisms. However, combining information from several tasks requires careful consideration of the degree of similarity between tasks. Our proposed method simultaneously learns or refines the similarity between tasks along with the Multitask Learning classifier. This is done by formulating the Multitask Learning problem as Multiple Kernel Learning, using the recently published q-Norm MKL algorithm. Results: We demonstrate the performance of our method on two problems from Computational Biology. First, we show that our method is able to improve performance on a splice site dataset with a given hierarchical task structure by refining the task relationships. Second, we consider an MHC-I dataset, for which we assume no knowledge about the degree of task relatedness. Here, we are able to learn the task similarities ab initio along with the Multitask classifiers. In both cases, we outperform the baseline methods that we compare against. Conclusions: We present a novel approach to Multitask Learning that is capable of learning task similarity along with the classifiers. The framework is very general: it allows prior knowledge about task relationships to be incorporated if available, but is also able to identify task similarities in the absence of such prior information. Both variants show promising results in applications from Computational Biology.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

MPG.PuRe

Vector Field Learning via Spectral Filtering

Author: A. Argyriou
A. Smola
A.N. Tikhonov
H.W. Engl
L. Devroye
M.L. Stein
T. Evgeniou
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2010

Crossref

Using Both Latent and Supervised Shared Topics for Multitask Learning

Author: A. Torralba
C. Wang
D.G. Lowe
D.M. Blei
I. Biederman
J. Zhang
K.D. Bollacker
R. Ando
R. Caruana
R. Jenatton
R.-E. Fan
S. Ben-David
S. Bickel
S.J. Pan
T. Evgeniou
T. Evgeniou
Y. Xue
Y.W. Teh
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2013

Crossref

Multi-Target Prediction: A Unifying View on Problems and Methods

Multi-target prediction (MTP) is concerned with the simultaneous prediction of multiple target variables of diverse type. Due to its enormous application potential, it has developed into an active and rapidly expanding research field that combines several subfields of machine learning, including multivariate regression, multi-label classification, multi-task learning, dyadic prediction, zero-shot learning, network inference, and matrix completion. In this paper, we present a unifying view on MTP problems and methods. First, we formally discuss commonalities and differences between existing MTP problems. To this end, we introduce a general framework that covers the above subfields as special cases. As a second contribution, we provide a structured overview of MTP methods. This is accomplished by identifying a number of key properties, which distinguish such methods and determine their suitability for different types of problems. Finally, we also discuss a few challenges for future research.

arXiv.org e-Print Archive

Crossref

Ghent University Academic Bibliography

Multiple functional regression with both discrete and continuous covariates

Author: A Cuevas
C Preda
F Ferraty
F Rossi
G He
G James
H Cardot
H Lian
H Matsui
H. Cardot
H.G. Müller
J Ramsay
J Ramsay
J Ramsay
J. Faraway
J.M. Chiou
L Prchal
MJ Valderrama
T Evgeniou
Publication venue: Physica-Verlag/Springer
Publication date: 16/06/2011

In this paper we present a nonparametric method for extending functional regression methodology to the situation where more than one functional covariate is used to predict a functional response. Borrowing the idea from Kadri et al. (2010a), the method, which supports mixed discrete and continuous explanatory variables, is based on estimating a function-valued function in reproducing kernel Hilbert spaces by virtue of positive operator-valued kernels.

HAL - Normandie Université

HAL - Lille 3

Crossref

INRIA a CCSD electronic archive server

Robustness and Generalization

We derive generalization bounds for learning algorithms based on their robustness: the property that if a testing sample is "similar" to a training sample, then the testing error is close to the training error. This provides a novel approach, different from the complexity or stability arguments, to study generalization of learning algorithms. We further show that a weak notion of robustness is both sufficient and necessary for generalizability, which implies that robustness is a fundamental property for learning algorithms to work.

arXiv.org e-Print Archive

CiteSeerX

Crossref

ScholarBank@NUS

Genetic sequence-based prediction of long-range chromatin interactions suggests a potential role of short tandem repeat sequences in genome organization

Author: A Malaspina
A Sanyal
A Tanay
BE Boser
C Elkan
C Widmer
E de Wit
E Lieberman-Aiden
F Ay
G Rätsch
G Rätsch
H Hamada
J Dekker
J Dekker
J Dostie
J Harrow
JO Yáñez-Cuna
JR Dixon
JR Hughes
KJ Brookes
L Jacob
M Simonis
MJ Fullwood
MJ Zeitz
N Cope
N Heidari
N Varoquaux
Nico Pfeifer
P Meinicke
P Vogt
R Edgar
S Ramamoorthy
Sarvesh Nikumbh
SSP Rao
T Evgeniou
T Evgeniou
T Lingner
TD Schneider
WA Bickmore
Z Zhao
Publication venue: Springer Science and Business Media LLC
Publication date

Crossref

Efficient Training of Graph-Regularized Multitask SVMs

Author: A. Torralba
C. Cortes
D. Bertsekas
K.R. Müller
M. Kloft
R. Fan
R.M. Rifkin
S. Sonnenburg
S. Sonnenburg
T. Evgeniou
T. Joachims
T.W.T.C.C. Consortium
W. Samek
Y. Xue
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2012

We present an optimization framework for graph-regularized multi-task SVMs based on the primal formulation of the problem. Previous approaches employ a so-called multi-task kernel (MTK) and thus are inapplicable when the number of training examples n is large (typically n < 20,000, even for just a few tasks). In this paper, we present a primal optimization criterion, allowing for general loss functions, and derive its dual representation. Building on the work of Hsieh et al. [1,2], we derive an algorithm for optimizing the large-margin objective and prove its convergence. Our computational experiments show a speedup of up to three orders of magnitude over LibSVM and SVMLight for several standard benchmarks as well as challenging data sets from the application domain of computational biology. Combining our optimization methodology with the COFFIN large-scale learning framework [3], we are able to train a multi-task SVM using over 1,000,000 training points stemming from 4 different tasks. An efficient C++ implementation of our algorithm is being made publicly available as a part of the SHOGUN machine learning toolbox [4].

Crossref

MPG.PuRe

ProDiGe: Prioritization Of Disease Genes with multitask machine learning from positive and unlabeled examples

Author: A Su
B Brancotte
B Calvo
B Linghu
B Liu
B Schölkopf
B Schölkopf
B Schölkopf
C Giallourakis
C Perez-Iratxeta
C Son
CC Chang
EA Adie
F Denis
F Mordelet
Fantine Mordelet
FS Turner
G Lanckriet
GRG Lanckriet
J Freudenberg
Jean-Philippe Vert
K Bleakley
K Lage
L Jacob
L Jacob
LC Tranchevent
M van Driel
N López-Bigas
N Tiffin
O Vanunu
P Pavlidis
RI Kondor
S Aerts
S Köhler
S Yu
T De Bie
T Evgeniou
T Hwang
U Ala
V McKusick
X Wu
Y Yamanishi
Publication venue: BioMed Central
Publication date: 01/01/2011

Abstract. Background: Elucidating the genetic basis of human diseases is a central goal of genetics and molecular biology. While traditional linkage analysis and modern high-throughput techniques often provide long lists of tens or hundreds of disease gene candidates, the identification of disease genes among the candidates remains time-consuming and expensive. Efficient computational methods are therefore needed to prioritize genes within the list of candidates, by exploiting the wealth of information available about the genes in various databases. Results: We propose ProDiGe, a novel algorithm for Prioritization of Disease Genes. ProDiGe implements a novel machine learning strategy based on learning from positive and unlabeled examples, which makes it possible to integrate various sources of information about the genes, to share information about known disease genes across diseases, and to perform genome-wide searches for new disease genes. Experiments on real data show that ProDiGe outperforms state-of-the-art methods for the prioritization of genes in human diseases. Conclusions: ProDiGe implements a new machine learning paradigm for gene prioritization, which could help identify new disease genes. It is freely available at http://cbio.ensmp.fr/prodige.

arXiv.org e-Print Archive

Crossref

Springer - Publisher Connector

Directory of Open Access Journals